Parallel Dictionary Learning Using a Joint Density Restricted Boltzmann Machine for Sparse-representation-based Voice Conversion
نویسندگان
چکیده
In voice conversion, sparse-representation-based methods have recently been garnering attention because they are, relatively speaking, not affected by over-fitting or over-smoothing problems. In these approaches, voice conversion is achieved by estimating a sparse vector that determines which dictionaries of the target speaker should be used, calculated from the matching of the input vector and dictionaries of the source speaker. The sparse-representation-based voice conversion methods can be broadly divided into two approaches: (1) an approach that uses raw acoustic features in the training data as parallel dictionaries, and (2) an approach that trains parallel dictionaries from the training data. Our approach belongs to the latter; we systematically estimate the parallel dictionaries using a restricted Boltzmann machine, a fundamental technology commonly used in Toru Nakashika, Tetsuya Takiguchi and Yasuo Ariki 102 deep learning. Through voice conversion experiments, we confirmed the high-performance of our method, comparing it with the conventional Gaussian mixture model (GMM)-based approach, and a non-negative matrix factorization (NMF)-based approach, which is based on sparse representation.
منابع مشابه
Deblocking Joint Photographic Experts Group Compressed Images via Self-learning Sparse Representation
JPEG is one of the most widely used image compression method, but it causes annoying blocking artifacts at low bit-rates. Sparse representation is an efficient technique which can solve many inverse problems in image processing applications such as denoising and deblocking. In this paper, a post-processing method is proposed for reducing JPEG blocking effects via sparse representation. In this ...
متن کاملA Hybrid Algorithm based on Deep Learning and Restricted Boltzmann Machine for Car Semantic Segmentation from Unmanned Aerial Vehicles (UAVs)-based Thermal Infrared Images
Nowadays, ground vehicle monitoring (GVM) is one of the areas of application in the intelligent traffic control system using image processing methods. In this context, the use of unmanned aerial vehicles based on thermal infrared (UAV-TIR) images is one of the optimal options for GVM due to the suitable spatial resolution, cost-effective and low volume of images. The methods that have been prop...
متن کاملSpeech Enhancement using Adaptive Data-Based Dictionary Learning
In this paper, a speech enhancement method based on sparse representation of data frames has been presented. Speech enhancement is one of the most applicable areas in different signal processing fields. The objective of a speech enhancement system is improvement of either intelligibility or quality of the speech signals. This process is carried out using the speech signal processing techniques ...
متن کاملImage Classification via Sparse Representation and Subspace Alignment
Image representation is a crucial problem in image processing where there exist many low-level representations of image, i.e., SIFT, HOG and so on. But there is a missing link across low-level and high-level semantic representations. In fact, traditional machine learning approaches, e.g., non-negative matrix factorization, sparse representation and principle component analysis are employed to d...
متن کاملJoint spectral distribution modeling using restricted boltzmann machines for voice conversion
This paper presents a new spectral modeling and conversion method for voice conversion. In contrast to the conventional Gaussian mixture model (GMM) based methods, we use restricted Boltzmann machines (RBMs) as probability density models to model the joint distributions of source and target spectral features. The Gaussian distribution in each mixture of GMM is replaced by an RBM, which can bett...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2014